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Abstract 

This  paper  discusses  two  algorithms  for  non- 
llnearly  constrained  optimization.  These  algo- 
rithms — the  penalty  and  barrier  trajectory  algo- 
rithms --  are  based  on  an  examination  of  the  tra- 
jectories of  approach  to  the  solution  that  charac- 
terize the  quadratic  penalty  function  and  the 
logarithmic  barrier  function,  respectively.  Al- 
though closelv  related  in  principle,  the  two  algo- 
rithms display  important  differences  in  their 
implementation  as  well  as  in  the  properties  of  the 
gencr.it  * d iterates.  The  discussion  will  emphasize 
'he  numerical  aspects  of  implementation  of  the 
joctory  algorithms,  with  particular  attention 
to  :he  choice  of  reliable  methods  for  carrying 
out  “he  required  computations. 


1 . Introduction 

The  problem  to  be  considered  in  this  paper 
is  the  following: 

Pi:  minimize  F(x) , x € E° 

subject  to  Cj(x)>0#  1 “ 1,2,. ..,m, 

where  F(x)  and  fCj(x)}  are  prescribed  non- 
linear functions.  The  function  F(x)  is  usually 
termed  the  "objective  function"  and  the  set 
(Cj(x)!  is  the  set  of  "constraint  functions".  It 
will  be  assumed  for  simplicity  that  F and  {c^} 
are  twice  continuously  differentiable,  although 
the  methods  to  be  discussed  will  cope  with  occa- 
sional discontinuities. 

The  solution  of  PI  will  be  denoted  by  x*. 

In  all  problems  to  be  considered,  the  first-  and 


second-order  Kuhn-Tuckef  conditions  are  assumed  to 
be  satisfied  at  x*,  so  that  there  exists  a vector 
X*  of  non-negative  Lagrange  multipliers,  corre- 
sponding to  the  active  constraints  at  x*,  satisfy- 
ing 

g(x*>  - A(x*)X*  - 0 (1) 


where  g(*)  is  the  gradient  of  F,  and  the  columns 
of  A(*)  are  the  gradients  of  the  constraints 
active  at  x*. 

It  is  customary  to  state  problem  PI  with  two 
kinds  of  constraints  - lity  constraints  (of 

the  form  (x)  * 0 1 as  the  inequalities 

given  above  in  PI.  . .x>  Jtinction  will  not  be 
made  during  the  present  discussion,  in  order  to 
avoid  the  introduction  of  additional  notation.  The 
treatment  of  inequality  constraints  is  typically 
more  complicated  than  that  of  equality  constraints, 
and  the  algorithms  to  be  discussed  can  deal  in  an 
obvious  way  with  equality  constraints. 

Because  the  problem  PI  can  not,  in  general, 
be  solved  explicitly,  a popular  approach  during 
the  past  decade  has  been  to  transform  PI  into  a 
sequence  of  unconstrained  minimization  problems. 

The  most  common  such  transformation  has  been 
effected  bv  the  use  of  penalty  and  barrier  function 
methods,  which  are  discussed  at  length  in,  for 
example,  Fiacco  and  McCormick  (1968)  and  Ryan 
(1974).  This  paper  will  not  review  these  methods, 
but  their  properties  are  critical  In  the  deriva- 
tion of  the  algorithms  to  be  discussed. 

The  quadratic  penalty  function  corresponding 
to  the  problem  PI  Is  defined  bv 


p(x,p)  F(x)  + 2 ^ (c((x))1 2  , 


1 


<2.0 


where  I U a subset  of  the  Indices  (1,2,. 
(usually,  the  set  of  constraints  whose  values  are 
less  than  a small  positive  number),  and  p Is  a 
positive  scalar  termed  the  "penalty  parameter". 

The  quadratic  penalty  function  may  also  be  written 
as: 

P(x,r)  : F(x)  4r''[  (c.(x))2  . (2b) 

2 iei  1 

where  r - 1/p.  Let  x*(r)  denote  an  unconstrained 
minimum  of  P(x,r). 

The  logarithmic  barrier  function  corresponding 
to  the  problem  PI  is  given  by: 

m 

B(x,r)  s F(x)  - t l tn(c  (x))  , (3) 

1-1  1 

where  r is  a positive  scalar  termed  the  "barrier 
parameter".  The  logarithmic  barrier  function  Is 
defined  only  at  points  that  strictly  satisfy  all 
of  the  problem  constraints.  Let  x*(r)  denote 

D 

an  unconstrained  minimum  of  B(x,r). 

L'nder  certain  mild  conditions,  there  exists 
r > 0 such  that  for  r < r,  x*(r)  and  x*(r) 
are  continuous  functions  of  r,  and: 

lim  x*(r)  ■ x*  ; 
r * 0 V 

lim  x*(r)  - x*  . 
r ► 0 B 

Although  penalty  and  barrier  function  methods 
display  several  good  features,  they  suffer  from 
certain  theoretical  and  numerical  defects  — in 
particular,  both  require  the  solution  of  a 
theoretically  infinite  sequence  of  unconstrained 
problems.  Furthermore,  in  practice  each  suc- 
ressivc  unconstrained  sub-problem  is  more  dif- 
ficult to  solve,  because  the  Hessian  matrices  of 
the  penalty  and  barrier  functions  become  increas- 
ingly iil-conditi onvd  as  r approaches  zero,  and 
are  singular  in  the  limit  (see  Murray,  1969a). 

The  continuous  lines  of  minima  in  En 

defined  by  x*(r)  and  x*(r)  are  termed  the 
r B 

"penalty  trajectory"  and  "barrier  trajectory"  of 
approach  to  x*,  respect  1 vel y . The  analysis  of 
these  trajectories,  and  & detailed  description  of 
the  trajectory  algorithms,  have  been  given  else- 
where (Murray,  1969a, h;  Wright,  1976;  Murray  and 
Wright,  1976h);  for  the  purposes  of  the  present 


discussion,  only  a brief  description  of  the 
underlying  motivation  will  be  stated. 

The  trajectory  algorithms  are  based  on  using 
the  properties  of  the  trajectories  to  generate  a 
sequence  of  Iterates  that  lie  in  a neighborhood 
of  the  appropriate  trajectory,  in  order  to  mimic 
the  approach  to  x*  of  the  iterates  from  a penalty 
or  barrier  function  method.  Because  it  is  possible 
to  characterize  a step  toward  the  penalty  or  barrier 
trajectory  without  assuming  that  the  current  iterate 
is  in  a close  neighborhood  of  x*,  the  derivation 
of  the  trajectory  methods  does  not  display  a 
stringent  dependence  on  properties  that  hold  only 
In  such  a neighborhood.  Moreover,  it  is  not  neces- 
sary for  any  of  the  Iterates  to  lie  exactly  on  the 
trajectory  (as  in  a penalty  or  barrier  function 
method) . 

At  each  Iteration  of  a trajectory  method,  the 
search  direction  Is  computed  as  a step  toward  some 
point  on  the  desired  trajectory.  The  particular 
point  to  be  aimed  for  depends  on  the  current  value 
of  the  penalty  or  barrier  parameter.  The  solution 
x*  Is  also  on  the  trajectories,  and  the  target 
point  will  ultimately  become  arbitrarily  close  to 
x*  as  the  algorithms  converge.  The  penalty  or 
barrier  parameter  may  be  adjusted  at  each  iteration 
of  the  trajectory  methods;  however,  the  choice  of 
the  parameter  value  is  not  critical,  since  a step 
to  a neighborhood  of  a point  on  the  trajectory 
corresponding  to  r is  also  in  the  neighborhood 
of  a point  corresponding  to  (l+t)r,  where  t Is 
small.  The  numerical  procedure  for  determining 
the  search  direction  in  both  algorithms  Is  well- 
posed,  and  the  approach  to  the  limit  of  the  penalty 
or  harrier  parameter  does  not  cause  any  lll-condl- 
t ioning. 

This  paper  will  emphasize  some  numerical 
aspects  of  implementation  of  the  trajectory  methods, 
with  particular  attention  to  the  choice  of  reliable 
procedures  for  carrying  out  the  required  computa- 
tions. The  emphasis  on  the  details  of  Implementa- 
tion Is  deliberate;  even  within  an  algorithm  that 
has  been  designed  from  the  outset  to  be  robust, 
additional  safeguards  arc  necessary  to  protect 
against  failure  or  illogical  results  when  the  under- 
lying assumptions  are  hot  satisfied. 
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2 . Description  of  Trajectory  Algorithms 

Only  certain  key  aspects  of  Implementation  of 
tlu>  trajectory  metliods  have  been  selected  for  dis- 
cussion In  Section  3.  Accordingly,  the  descrip- 
t Ions  given  here  of  the  penalty  and  barrier  trajec- 
tory algorithms  are  slightly  abbreviated,  and  do 
not  contain  all  the  computational  details.  A 
complete  description  of  both  algorithms  is  given 
in  Murray  and  Wright  (1976b). 


for  the  quadratic  penalty  function;  the  step  to 
be  taken  along  the  search  direction  is  then  chosen 
to  achieve  an  acceptable  decrease  in  the  penalty 
function. 

2.1.2.  Calculation  of  the  search  direction 

At  the  beginning  of  the  k-th  iteration  of  the 
penalty  trajectory  algorithm,  the  following  vectors 
and  matrices  are  assumed  to  be  available: 


2.1.  Penalty  Trajectory  Algorithm 

2.1.1.  Properties  of  the  search  direction 
For  the  penalty  trajectory  algorithm,  at 

each  iteration  tlie  search  direction,  p,  is 
(ideally)  constructed  as  the  solution  of  the  follow- 
ing quadratic  program: 

QP1:  min  ^ pT  Sp  + pTg 

*T  “ i 

subject  to  A = -c  - — A , 

P P 


«(k), 

.00 

“ I 

.00 


A 00 
A , 


s(k>. 


an  approximation  to  x*; 

the  vector  of  values  of  {c,(x)l 

(k)  1 

evaluated  at  x ; 

the  gradient  vector  of  F(x) 

(k) 

evaluated  at  x ; 

the  matrix  whose  columns  are  the 
gradients  of  (c,(x)}  evaluated 
at  x(k); 

an  approximation  to  the  Hessian 
matrix  of  the  Lagrangian  function 
„00 


where  c denotes  the  vector  of  constraints  cur- 
rently considered  "active";  A is  a matrix  whose 
columns  are  the  gradients  of  the  active  constraints; 
A fs  an  estimate  of  the  Lagrange  multiplier  vector; 
a is  the  current  value  of  the  penalty  parameter; 
g is  the  gradient  of  F;  and  S is  a matrix  that 
approximates  the  Hessian  of  the  Lagrangian  function. 

Let  Y be  a matrix  whose  columns  form  an 
orthogonal  basis  for  the  range  of  the  columns  of  A, 
and  let  Z be  a matrix  whose  columns  form  an 
orthogonal  basis  for  the  corresponding  null  space, 
i .e. , , 


atz  - 0 , 


The  procedures  followed  during  the  k-th 
iteration  to  compute  the  next  iterate  are: 

(1)  An  "active  set"  of  constraints  is  deter- 
mined, containing  < n elements  (see  Section  3.1 
for  a discussion  of  the  case  where  the  active  set 
contains  more  than  n elements).  The  vector  of 
active  constraint  values  will  be  denoted  by  c, 
and  the  matrix  whose  columns  are  the  gradients  of 
those  constraints  will  be  denoted  by  A. 

(2)  Factorize  A such  that 


QA  - 


T 

Q Q - I • 


If  A has  full  column  rank  (<_  n) , and  if 
T 

the  matrix  Z SZ  is  positive  definite,  then  the 
solution  of  QP1,  p*,  can  be  uniquely  expressed  as 
the  sum  of  two  orthogonal  components: 


where  R is  an  upper  triangular  matrix.  Define 
the  matrices  Y and  Z by  partitioning  Q as: 


P*  ’ YpR  + ZpN  * 

For  sufficiently  large  p,  the  search  direction  p* 
so  constructed  will  always  be  a descent  direction 


(3)  Determine  an  estimate,  A,  of  the  Lagrange 

multiplier  vector.  If  A has  full  solumn  rank, 

* (k)  2 

X is  the  least-squares  solution  of  mlnlAX-g  Ij, 
and  is  given  by  the  solution  of  the  triangular 


3 


system: 


(6) 


RA 


yT  g(k) 


If  A Is  rank-deficient,  the  vector  A will  be 
taken  as  the  minimum-length  least-squares  solution; 
it  is  obtained  by  extending  the  factorisation  of 
Step  (2)  to: 


(A) 


whore  R is  a non-singular  upper  triangular 
matrix  and  V is  an  orthogonal  matrix  (see 
Peters  and  Wilkinson,  1970,  for  further  details). 

(4)  Determine  an  appropriate  value  of  the 
penalty  parameter,  p (Murray  and  Wright,  1976b). 

(5)  Compute  the  vector  pR,  as  follows.  If 
A has  full  rank,  pR  is  obtained  by  solving  the 
linear  system: 

ATp  = A1  YPr  - RTpR--c  - j A . (5) 

In  this  way,  the  direction  Yp^  satisfies  the 
linear  equaJity  constraints  of  QP1.  If  A is 
rank-deficient,  pR  is  a least-squares  solution 
of  (5),  computed  using  the  complete  orthogonal 
factoriz.it ion  (4);  the  linear  constraints  of  QP1 
will  then  not  be  exactly  satisfied. 

(6)  Determine  the  modified  Cholesky  factor- 

T Ik) 

ization  of  the  matrix  2 /Z.  With  this  pro- 

T (k) 

cedure,  the  matrix  Z S Z is  augmented  (if 

necessary)  by  a positive  diagonal  matrix,  E, 

T (k^ 

chosen  to  make  (Z  S Z + E)  strictly  (numer- 

T 

ically)  positive  definite.  Let  LDL  be  the 
computed  factorization,  so  that: 

T T (k) 

LDL  =■  7 Sv  fZ  + E , 


T (k) 

where  E is  identically  zero  if  Z Sv  rZ  is 
sufficiently  positive  definite  (Gill  and  Murray, 
1972a). 

(7)  Determine  the  vector  pkl  by  solving 

N 


i-0LT  p„ 


_ZT  g00 


Test  whether: 


* PN*  £ MlpRl  and  lpRl  £ , 

for  M a reasonably  large  positive  number  (say, 

1,000). 

(a)  If  the  test  (6)  is  satisfied  (as  it 
almost  always  is  in  practice),  compute  pR  by 
solving: 

LDLT  Pn  - -ZT(g(k)  + S(k)  YpR)  ; 
then  form  the  search  direction  as: 

P - YpR  + zpN  . 

(b)  If  the  test  (6)  is  not  satisfied,  the 
two  orthogonal  portions  of  the  search  direction 
are  not  well-scaled,  and  the  following  re-scaling 
procedure  is  used  to  adjust  for  the  Imbalance. 

If  I?,!  > define  a scaling  factor  8^ 

as 


MlV 

•5.1 


and  let 


P 


YPR  + 


P,  Zp 


N 


otherwise,  define  a scaling  factor  as 


and  let 


p - P2  YPr  + ZpN  . 

(8)  Determine  a step  length,  a,  that  generates 
an  acceptable  reduction  in  the  penalty  function 
P(x,  p),  using  a safeguarded  cubic  or  parabolic 
step  length  algorithm  (e.g.,  the  procedure  de- 
scribed in  Gill  and  Murray,  1974).  Special  care 
must  be  exercised  in  the  step  length  algorithm,  to 
avoid  difficulties  if  P(x,  p)  is  unbounded 

below  along  the  given  search  direction  (see 
Section  3.5.1). 

(9)  Set  *<k+1>  to  x<k)  + up,  and  return 
to  Step  (1) . 
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2.2.  Barrier  Trajectory  Algorithm 

2.2.1.  Properties  of  the  search  direction 

The  search  direction  of  the  barrier  trajectory 
algorithm  is  (ideally)  constructed  as  the  solution 
of  the  following  quadratic  program: 

IT  T 

QP2:  min  y p Sp  + p g 

*T 

subject  to  A p " d , 

where  d^  - -c^  + r/A^;  c denotes  the  vector  of 
constraints  currently  considered  "active";  A is 
a matrix  whose  columns  are  the  gradients  of  the 
active  constraints;  A is  an  estimate  of  the 
Lagrange  multipliers  corresponding  to  the  active 
constraints;  r is  the  current  value  of  the  barrier 
parameter;  g is  the  gradient  of  F;  and  S is 
an  approximation  to  the  Hessian  of  the  Lagranglan 
function. 

It  Is  essential  to  achieve  a reduction  in  the 
barrier  function  B(x,r)  at  each  iteration, 
because  the  barrier  function  serves  as  a conven- 
ient "merit  function"  for  measuring  progress  toward 
x*.  The  derivation  of  the  barrier  trajectory  algo- 
rithm indicates  that  the  search  direction  given  by 
the  solution  of  QP2  may  not  always  be  a descent 
direction  for  B(x,r).  Therefore,  the  null-space 
component  of  the  search  direction  may  alternatively 
be  computed  to  minimize  a quadratic  approximation 
to  the  Lagranglan  function,  independent  of  the 
component  in  the  range  of  the  columns  of  A.  This 
alternative  formulation  of  the  search  direction  is 
necessary  because  of  the  quite  different  roles  of 
the  penalty  and  barrier  parameters  as  the  solution 
is  approached. 

2.2.2.  Calculation  of  the  search  direction 

At  the  beginning  of  the  k-th  iteration  of  the 
barrier  trajectory  algorithm,  the  same  vectors 
and  matrices  are  available  as  for  the  penalty  tra- 
jectory algorithm.  The  iterates  generated  by  the 
barrier  trajectory  algorithm  necessarily  lie  In 
the  strict  interior  of  the  feasible  region;  this 
algorithm  is  Intended  for  use  on  problems  where 
some  or  all  of  the  problem  functions  may  be  ill- 
defined  or  undefined  outside  the  feasible  region. 

The  computational  procedures  followed  during 
the  k-tli  Iteration  are: 


(1)  Determine  the  set  of  "active"  constraints, 
denoted  by  c (see  Section  3.2);  form  the  matrix 

* (It) 

A,  whose  columns  are  the  columns  of  A corre- 
sponding to  the  active  set.  By  construction,  A 
has  ^n  columns. 

(2)  Factorize  A such  that 


T 

Q Q “ I . 


as  before. 

(3)  Determine  the  Lagrange  multiplier  esti- 
mate X by  solving: 


RA 


yT  g(k) 


so  that  X is  the  least-squares  solution  of 
minlAA-g^l^;  a similar  procedure  to  that  given 
for  the  penalty  trajectory  algorithm  is  followed 
if  A is  rank-deficient.  If  one  or  more  compon- 
ents of  A are  negative,  the  constraint  corre- 
sponding to  the  most  negative  Is  deleted  from  the 
active  set;  the  modified  A is  then  factorized, 
and  the  new  A vector  calculated  for  the  re- 
defined active  set.  Since  the  new  A Is  simply 
the  previous  A with  one  column  deleted,  the  new 
factorization  can  be  obtained  by  a simple  updating 
scheme  (Gill,  Golub,  Murray,  and  Saunders,  1974). 

(4)  Determine  the  barrier  parameter,  r 
(Murray  and  Wright,  1976b). 

(5)  Determine  the  vector  d according  to 
the  following  rules.  Define  s ■ Id  + Iz^g^^l, 
and  set  c - y r/s,  where  y > 1 (say,  2).  Let 
r be  the  barrier  parameter  from  the  previous 
iteration;  then: 

if  Xl  > c,  set  dt  • -Cj  + j-  ; 

if  Aa  -c , set  dj  ■ -(1  - -r)  c1; 

otherwise,  dj  • -e  . 

(6)  Compute  the  vector  pR,  which  Is  the 
solution  of  the  linear  system: 

*T  r 

A Yp„  - R pR  - d . 
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In  this  way,  pR  satisfies  the  desired  linear 
equality  constraints  of  QP2  for  those  problem 
constraints  for  which  Is  sufficiently  positive; 

an  alternative  relationship  is  satisfied  for  each 
"active"  constraint  that  has  an  insufficiently 
positive  multiplier  estimate.  Again,  the  rank- 
deficient  case  is  treated  as  for  the  penalty  tra- 
jectory algorithm. 

(7)  Compute  the  modified  Cholesky  fnctoriza- 
T (k) 

tion  of  7.  S 7.  (as  in  the  penalty  trajectory 
algorithm);  the  factorization  will  be  denoted  by 


(8)  Determine  p^  by  solving: 


lult  pn  . -zT  g(k) 


Test  whether: 


llpNl!  • MllpRl  and  tpRl  <_  Mlpjjl  , (7) 


for  M a reasonably  large  positive  number. 

(a)  If  the  test  (7)  is  satisfied,  obtain 
PN  by  solving 


LDLT  p„ 


-zVk) 


+ S(k)  Ypn) 


and  define  the  trial  search  direction  as: 


P - VpR  + ZpN  . 

If  p is  not  a descent  direction  for  B(x,r), 
re-define  p as: 


This  latter  definition  is  guaranteed  to  yield 
a descent  direction  for  B(x,r). 

(b)  If  the  test  (7)  is  not  satisfied,  then 
adjust  the  scaling,  as  in  the  penalty  trajectory 
algor  1 thm. 

(9)  Determine  a step  length,  a,  that 
accomplishes  a suitable  reduction  in  B(x,r), 
using  special  procedures  designed  for  one-dimen- 
sional minimization  with  respect  to  the  logarithmic 
barrier  function  (see  Section  3.5.2).  During  the 
search  procedure,  record  whether  violation  of  any 
constraint  currently  considered  "inactive"  restricts 
tin-  step  length;  if  so,  tills  constraint  will  be 


added  to  the  active  sat  at  the  next  Iteration. 

(11)  Set  x(k+1)  to  x(k>  + up,  and  return 
to  Step  (1). 


3 . Some  Considerations  of  Numerical  Analysis  in 
Implementation  of  the  Trajectory  Algo rltlims 
In  this  section,  we  consider  some  selected 
aspects  of  implementation  of  the  trajectory  methods 
from  the  viewpoints  of  numerical  analysis  and 
algorithm  definition.  It  will  be  stressed  through- 
out this  discussion  that  an  Implementation  could 
not  achieve  practical  success  if  the  definition  of 
the  algorithm  depended  critically  on  properties 
that  hold  only  in  a close  neighborhood  of  x*;  the 
algorithm  should  produce  sensible  results,  even 
when  such  conditions  are  not  satisfied  at  the 
current  point . 


3.1.  Selection  of  the  Active  Set  for  the  Penalty 
Trajectory  Algorithm 

The  "active  set”  of  constraints  is  defined 
at  each  iteration  of  the  penalty  trajectory  algo- 
rithm as  the  set  of  constraints  whose  values  are 
less  than  a specified  small  positive  number.  With 
this  definition,  the  active  set  is  equivalent  to 
the  "violated  set",  and  can  easily  be  determined. 
Such  a strategy  is  reasonable  because  the  penalty 
trajectory  algorithm  is  based  on  properties  of  the 
quadratic  penalty  function,  and  for  a sufficiently 
large  value  of  p,  the  set  of  constraints  violated 
at  x*(P)  is  Identical  to  the  set  of  constraints 
active  at  x*  (Fiacco  and  McCormick,  1968).  After 
the  first  few  Iterations,  the  active  set  typically 
remains  fixed  for  the  rest  of  the  computation. 

It  was  noted  in  the  definition  of  the  algo- 
rithm that  a special  procedure  is  used  when  more 
than  n constraints  are  violated  at  the  beginning 
of  an  iteration.  If  more  than  n constraints  are 
violated,  and  A has  full  rank,  the  search  direc- 
tion, p,  is  chosen  to  attempt  to  minimize 
lc(x+p)^,  by  computing  p ns  the  solution  of 
rainlc  + A pi  . The  search  direction  in  this  case 

ia  calculated  as  follows: 

*T 

(1)  Factorize  A in  the  form 


qtq  •*  I . 
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where  R is  upper  triangular  and  non-singular. 

T* 

(2)  Solve  Rp  * -Y  c,  where  the  columns  of 
Y are  given  by  the  first  n rows  of  Q. 

The  computational  procedure  is  very  similar 
to  the  calculation  of  the  Lagrange  multiplier 

estimates,  and  relies  on  orthogonal  transformations 
T 

to  reduce  A to  upper  triangular  form. 

Normally,  the  condition  that  more  than  n 
constraints  are  violated  occurs  because  the  current 
point  is  a poor  estimate  of  the  solution,  and 
does  not  hold  at  the  next  iterate.  However,  it  is 
conceivable  that  this  condition  could  exist  even 
at  x*,  so  that  possibly  every  iteration  might  be 
special.  In  this  case,  the  Hessian  matrix  of  the 
penalty  function  is  not  ill-conditioned  as  the 
penalty  parameter  approaches  its  limit.  This 
special  procedure  has  the  same  effect  as  choosing 
( * in  the  usual  definition  of  the  algorithm, 
and  hence  is  equivalent  to  the  Gauss-Newton  method. 


3.2.  Sele ctlon  of  the  Active  Set  for  the  Barrier 
T raj .* ‘ 1 tory  A1  gori thm 

file  criteria  for  selecting  the  active  set  in 
the  barrier  trajectory  ithm  are  not  so 

straightforward  as  i:  alty  trajectory  algo- 
rithm. Because  tl  ajectory  algorithm 

is  a feasible-po  problem  constraints 

are  satisfied  at  ration,  and  the  subset 

of  active  constraints  must  be  determined  by  analysis 
of  the  behavior  of  the  constraints  as  the  computa- 
tion proceeds. 

The  procedure  for  determining  the  initial 
active  set  is  described  in  Murray  and  Wright 
(1976b),  and  has  been  satisfactory  on  the  examples 
tested.  The  active  set  tends  to  be  altered  only 
during  the  early  iterations,  because  of  the  possibil- 
ity of  misleading  local  Indications  that  certain  con- 
straints arc  active.  The  following  rules  are  used 
to  modLfy  the  active  set  at  each  iteration: 

(1)  The  constraint  corresponding  to  the  most 
negative  Lagrange  multiplier  estimate  (if  one 
exists)  is  deleted,  and  the  remaining  multipliers 
are  modified  accordingly. 

(2)  If  any  constraint  considered  active 
appears  to  be  bounded  away  from  zero  as  the  solu- 
tion is  approached,  it  is  deleted  from  the  active 
set.  This  decision  is  highly  dependent  on  scaling; 
further  discussion  is  given  in  Murray  and  Wright 


(1976b). 

(3)  If  the  step-length  algorithm  was  re- 
stricted because  a supposedly  Inactive  constraint 
was  violated,  this  constraint  is  added  to  the 
active  set  at  the  beginning  of  the  next  Iteration, 
and  will  not  be  deleted  during  that  iteration, 
regardless  of  the  sign  of  its  multiplier  estimate. 

(4)  If  the  number  of  active  constraints 
exceeds  n following  the  addition  of  (3),  the  con- 
straint with  the  largest  value  of  c^(x)  is 
deleted  from  the  active  set  (a  further  scaling- 
dependent  decision) . 

3.3.  Use  of  Orthogonal  Factorizations 

A factorization  involving  reduction  of  A to 
triangular  form  by  application  of  orthogonal 
transformations  is  used  in  several  steps  of  the 
trajectory  algorithms.  Such  a factorization  is 
convenient  and  reliable  for  computation,  and  has 
many  advantages  over  alternative  procedures.  For 

example,  in  some  algorithms  for  solving  PI,  the 

~T' 

matrix  A A is  formed  and  used  to  solve  linear 
systems;  the  poor  numerical  properties  of  this 
strategy  are  well-known  (see  Peters  and  Wilkinson, 
1970),  especially  the  possible  squaring  of  the  con- 
dition number  that  may  result  from  the  formation 
of  A A.  Furthermore,  if  the  matrix  A does  not 

/'T'' 

have  full  rank,  A A is  singular,  and  some  steps 
of  the  algorithm  may  then  be  undefined. 

Computation  of  the  complete  orthogonal  fac- 
torization of  A means  that  steps  of  the  trajectory 
algorithm  can  be  defined  (with  relatively  little 
extra  work)  even  if  A is  rank-deficient  (see 
Sections  3.3.1  and  3.3.2).  Although  only  the 
singular  value  decomposition  can  fully  reveal  the 
closeness  of  the  columns  of  A to  linear  dependence, 
the  complete  orthogonal  factorization  provides  ade- 
quate information  in  many  applications  (see  Golub, 
Klema,  and  Stewart,  1976),  since  the  conditioning 
of  the  triangular  matrix  R serves  to  Indicate 
the  "conditioning"  of  A. 

3.3.1.  Calculation  of  a Lagrange  multiplier 
estimate 

The  Lagrange  multiplier  estimate  at  each 
iteration  of  the  trajectory  algorithms  Is  computed 
as  a least-squares  solution  of  alnlAX-gljl  this 
first-order  estimate  Is  acceptable,  since  the 
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local  rate  of  convergence  of  the  trajectory  methods 
is  not  restricted  to  the  rate  of  convergence  of 
the  multipliers  (see  Wright,  1976).  The  orthogonal 
factorization  of  A can  be  used  to  calculate  a 
minimum-length  least-squares  solution,  even  when  A 
does  not  have  full  column  rank;  this  alternative 

is  not  possible  with  techniques  that  involve  forming 

'T~ 

A A. 

When  reducing  A to  upper  triangular  form, 
column  interchanges  are  carried  out  so  that  the 
reduced  column  of  largest  magnitude  is  selected 
as  the  next  column  to  be  reduced;  the  matrix  is 
considered  to  be  numerically  rank-deficient  if 
the  norms  of  all  unreduced  columns  are  less  than 
a prescribed  tolerance.  In  this  way,  all  diagonal 
elements  of  R are  bounded  below  by  the  specified 
tolerance.  Although  the  ill-conditioning  of  R 
does  not  necessarily  reveal  itself  by  the  presence 
of  a diagonal  element  that  is  very  small  relative 
to  the  largest  diagonal  element,  prevention  of  a 
too-small  diagonal  element  is  sufficient  in  many 
cases  to  control  serious  ill-conditioning  of  R. 

3.3.2.  Calculation  of  the  search  direction 

The  search  direction  in  both  trajectory 
algorithms  is  computed  in  two  orthogonal  components 
— one  in  the  range  of  the  columns  of  A,  the  other 
in  the  corresponding  null  space.  This  definition 
results  from  the  characterization  that  the  search 
direction  must  satisfy  a set  of  linear  equality 
constraints  of  the  form: 

ATp  * b , (8) 

where  b is  some  vector  depending  on  the  algorithm. 
If  A has  full  rank,  these  equality  constraints 
uniquely  determine  the  component  of  p In  the 
range  of  the  columns  of  A,  which  is  calculated 
as  follows. 

The  orthogonal  matrix  Q that  reduces  A 
to  upper  triangular  form  Is  explicitly  formed, 
by  multiplying  out  the  orthogonal  transformations 
used  in  the  reduction.  Once  Q is  available,  its 
rows,  appropriately  partitioned,  provide  the  re- 
quired orthogonal  bases  for  the  range  and  null 
space  of  the  columns  of  A. 

In  the  full  rank  case,  it  is  straightforward 
to  compute  the  component  of  p in  the  range  of  A. 


Since  p - Yp  + Zp  , the  equality  constraints  (8) 

K N 

imply: 

atp  - AT(YPr  + ZpN)  - at  yPr  - rtPr  - b , 

which  gives  pR  as  the  solution  of  a non-singular 
triangular  system. 

If  A is  rank-deficient,  the  component  pR 
may  be  obtained  as  the  least-squares  solution  of 
minlA^p  - bl  , again  using  the  complete  orthogonal 
factorization  of  A. 

In  either  case,  the  calculation  of  pR  is 
completely  straightforward. 

3.4.  Approximation  of  the  Hessian  of  the  Lagrangian 
Function 

Alternative  techniques  for  approximating  the 
Hessian  of  the  Lagrangian  function  under  various 
circumstances  will  not  be  discussed  in  any  detail 
(see  Murray  and  Wright,  1976b,  for  such  a discus- 
sion) , but  we  shall  consider  one  key  property  of 
the  Hessian  approximation. 

The  assumed  second-order  Kuhn-Tucker  conditions 
T 

imply  that  the  matrix  Z WZ  must  be  positive 
definite  at  x*,  where  Z is  defined  in  terms  of 
A(x*) , and  W is  the  Hessian  of  the  Lagrangian 
function;  however,  W itself  need  not  be  positive 
definite,  or  even  non-singular,  at  x*  or  in  any 
neighborhood  of  x*. 

Certain  approaches  to  solving  PI  impose 
additional  conditions  on  related  matrices  — for 
example,  methods  involving  augmented  Lagrangian 
functions  (see  Powell,  1969;  Fletcher,  1974) 
require  that  the  penalty  parameter  be  large  enough 
so  that  the  Hessian  of  the  augmented  function  is 

positive  definite.  In  both  trajectory  algorithms, 

T 

however,  only  the  projected  Hessian,  Z SZ,  must  be 

positive  definite  at  every  iteration,  in  order  for 

the  solutions  of  the  quadratic  programs  QP1  and 

QP2  to  be  bounded.  The  vector  p is  the  solution 

N 

of  a linear  system: 

ZTSZ  PN  - ZTd  , (9) 

for  some  vector  d,  and  should  be  the  step  to  the 
minimum  of  a quadratic  function  related  to  the 
Lagrangian  function. 
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Accordingly,  the  matrix  used  to  calculate 

is  always  maintained  as  numerically  positive  defi- 
T 

nite.  When  Z SZ  is  updated  by  a quasi-Newton 
technique,  positive  definiteness  is  maintained  by 
updating  the  Cholesky  factorization  of  the  pro- 
jected matrix,  as  in  revised  quasi-Newton  methods 

for  unconstrained  minimization  (Gill  and  Murray, 

T 

1972b).  When  Z SZ  is  obtained  from  exact,  or 
finite-difference  approximations  to,  second  deriv- 
atives, the  modified  Cholesky  factorization  of 
Z SZ  is  computed  in  order  to  solve  the  system  (9). 

In  all  cases,  the  matrix  used  to  solve  (9)  for  p^ 
T 

is  represented  as  LDL  , where  L is  unit  lower 
traingular,  and  D is  a diagonal  matrix  with  all 
elements  strictly  positive.  Such  a procedure 
assures  that  the  portion  of  the  search  direction 
in  the  null  space  of  the  columns  of  A is  always 
well-defined,  and  bounded. 

3.5.  Stc p-Length  Algorithms 
3.5.1.  Detection  and  correction  of  unbounded 
decrease  of  p enalty  function 
Even  when  the  problem  PI  has  a bounded  solu- 
tion, the  corresponding  penaLty  function  or  aug- 
mented Lagrangian  function  may  be  unbounded  below, 
for  arbitrarily  large  values  of  the  penalty  param- 
eter (Powell,  1972).  Accordingly,  when  executing 
a one-dimensional  minimization  with  respect  to  a 
penalty  function  or  augmented  Lagrangian  function, 
care  must  be  exercised  to  avoid  the  possibility 
of  taking  an  excessively  large  step. 

In  particular,  the  safeguarded  cubic  or 
quadratic  step-length  algorithms  (Gill  and  Murray, 
1974)  used  in  the  penalty  trajectory  algorithm 
require  specification  of  an  upper  bound  on  the 
step  length.  In  the  current  implementation  of 
the  penalty  trajectory  algorithm,  the  upper 
bound  is  set  to  correspond  to  a step  of  "reason- 
able" size,  rather  than  an  extremely  large  value. 

In  some  cases,  the  upper  bound  may  Impose  an 
unnecessary  limit  on  the  stepsize;  however,  in 
general  such  a restriction  will  cause  no  serious 
loss  of  efficiency  for  the  overall  computation, 
since  the  next  iteration  usually  corrects  the 
possible  poor  scaling  of  the  search  direction. 

This  conservative  strategy  is  considered  to  be 
justified  by  the  extreme  difficulties  that  result 
if  an  enormous  step  is  taken  because  the  penalty 


function  is  unbounded  below:  either  the  next 

iterate  is  completely  unreasonable,  or  a large 
number  of  evaluations  of  the  problem  functions  are 
required  before  the  unboundedness  is  detected. 

In  the  penalty  trajectory  algorithm,  it  is 
considered  that  the  penalty  function  may  be  unbound- 
ed along  the  given  direction  if  the  step  taken  is 
the  specified  upper  bound.  Almost  always,  the 
indicated  unboundedness  can  be  eliminated  simply 
by  increasing  the  penalty  parameter. 

3.5.2.  Special  techniques  for  the  barrier 
trajectory  algorithm 

At  each  iteration  of  the  barrier  trajectory 
algorithm,  a step-length  algorithm  is  executed  with 
respect  to  the  logarithmic  barrier  function,  which 
thus  serves  as  a "merit  function"*  Several  authors 
(Fletcher  and  McCann,  1969;  Lasdon,  et  al.,  1973) 
have  noted  the  deficiencies  of  standard  step-length 
algorithms,  which  are  usually  based  on  approximation 
by  low-order  polynomials,  when  applied  to  the  log- 
arithmic barrier  function.  Therefore,  the  step- 
length  algorithm  of  the  barrier  trajectory  method 
makes  use  of  special  techniques  that  exploit  the 
known  properties  of  the  logarithmic  barrier  function 
to  allow  more  efficient  estimation  of  an  appropriate 
step  length;  these  techniques  are  based  on  simple 
approximating  functions  that  contain  a logarithmic 
singularity.  Only  a small  amount  of  additional 
calculation  is  required  to  fit  the  special  approx- 
imating functions,  and  their  use  leads  to  a signif- 
icant increase  in  efficiency  of  the  one-dimensional 
minimization,  compared  to  standard  procedures 
(Murray  and  Wright,  1976a). 

4 . Conclusions 

The  penalty  and  barrier  trajectory  algorithms 
are  based  on  the  mathematical  properties  of  the 
approach  to  x*  of  the  successive  iterates  generated 
by  the  quadratic  penalty  function  and  logarithmic 
barrier  function,  respectively.  In  theory,  these 
algorithms  have  several  desirable  properties  — for 
example,  their  derivation  does  not  depend  on  condi- 
tions that  hold  only  in  a close  neighborhood  of  x*, 
and  their  rate  of  convergence  in  the  limit  is  arbi- 
trarily close  to  that  of  linearly  constrained 
Lagrangian  algorithms  (described  in  Robinson,  1972; 
Ro**en  and  Kreuser,  1972).  In  practice,  the  current 


implementations  of  the  trajectory  algorithms  have 
been  successful  on  many  problems,  deliberately 
Including  examples  for  which  the  ideal  assumptions 
are  violated.  The  results  thus  far  indicate  that 
these  algorithms  compare  favorably  with  similarly 
careful  implementations  of  other  algorithms  to 
solve  1*1  (see  Wright,  1976,  for  some  typi  al 
numerical  results). 

The  overall  aim  of  this  paper  has  been  to 
illustrate  some  of  the  cons iderat ions  of  numerical 
analysis  that  enter  the  choice  of  computational 
procedures  for  selected  aspects  of  the  trajectory 
algorithms.  Numerical  analysis  may  not  play  a 
significant  role  in  the  process  of  verifying  that 
the  expected  behavior  of  an  algorithm  under  ideal 
conditions  is  displayed  numerically.  However, 

It  is  an  elementary  fact  of  numerical  analysis 
that  theoretically  equivalent  mathematical  pro- 
cedures do  not  yield  equivalent,  or  even  close, 
numerical  results,  and  it  is  an  elementary  fact 
of  life  that  hoped-for  conditions  are  not  always 
satisfied.  Considerations  of  numerical  analysis 
should,  therefore,  be  applied  to  every  aspect  of 
the  definition  and  implementation  of  optimization 
algorithms  in  general. 
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